Natural language processing to automatically extract the presence and severity of esophagitis in notes of patients undergoing radiotherapy
Chen, Shan, Guevara, Marco, Ramirez, Nicolas, Murray, Arpi, Warner, Jeremy L., Aerts, Hugo JWL, Miller, Timothy A., Savova, Guergana K., Mak, Raymond H., Bitterman, Danielle S.
Radiotherapy (RT) toxicities can impair survival and quality of life, yet remain under-studied. Real-world evidence holds potential to improve our understanding of toxicities, but toxicity information often exists only in clinical notes. We developed natural language processing (NLP) models to identify the presence and severity of esophagitis from notes of patients treated with thoracic RT. We fine-tuned statistical and pre-trained BERT-based models for three esophagitis classification tasks: Task 1) presence of esophagitis, Task 2) severe esophagitis or not, and Task 3) no esophagitis vs. grade 1 vs. grade 2-3. Transferability was tested on 345 notes from patients with esophageal cancer undergoing RT. Fine-tuning PubmedBERT yielded the best performance. The best macro-F1 was 0.92, 0.82, and 0.74 for Tasks 1, 2, and 3, respectively. Selecting the most informative note sections during fine-tuning improved macro-F1 by over 2% for all tasks. Silver-labeled data improved the macro-F1 by over 3% across all tasks. For the esophageal cancer notes, the best macro-F1 was 0.73, 0.74, and 0.65 for Tasks 1, 2, and 3, respectively, without additional fine-tuning. To our knowledge, this is the first effort to automatically extract esophagitis toxicity severity according to CTCAE guidelines from clinic notes. The promising performance provides proof-of-concept for NLP-based automated detailed toxicity monitoring in expanded domains.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
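The abstract reports macro-F1 for each classification task. As a reference point, the sketch below implements macro-averaged F1 in plain Python, using Task 3's three-way label scheme (no esophagitis vs. grade 1 vs. grade 2-3) as an illustrative example; the labels and predictions are made up, not from the paper's data.

```python
def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores,
    so rare severity grades count as much as common ones."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Illustrative Task 3 labels: 0 = no esophagitis, 1 = grade 1, 2 = grade 2-3
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(y_true, y_pred, labels=[0, 1, 2]), 3))  # → 0.656
```

Libraries such as scikit-learn provide the same metric via `f1_score(..., average="macro")`; the hand-rolled version just makes the per-class averaging explicit.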
Valls-Vargas
Storytelling and story generation systems usually require knowledge about the story world to be encoded in some form of knowledge representation formalism, a notoriously time-consuming task requiring expertise in storytelling and knowledge engineering. In order to alleviate this authorial bottleneck, in this paper we propose an end-to-end computational narrative system that automatically extracts the necessary domain knowledge from a corpus of stories written in natural language and then uses that domain knowledge to generate new stories. Specifically, we employ narrative information extraction techniques that can automatically extract structured representations from stories and feed those representations to an analogy-based story generation system. We present the structures we used to connect two existing computational narrative systems and report our experiments using a dataset of Russian fairy tales. Specifically, we look at the perceived quality of the final natural language being generated and how errors in the pipeline affect the output.
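The pipeline described above — extract structured representations from stories, then generate new stories by analogy — can be sketched in miniature. Everything below is hypothetical and purely illustrative: `NarrativeTriple` and `retell_by_analogy` are invented names, and real narrative extraction and analogy-based generation are far richer than this role-substitution toy.

```python
from dataclasses import dataclass

# Hypothetical intermediate representation: the kind of structured
# triple a narrative information extractor might emit per event.
@dataclass(frozen=True)
class NarrativeTriple:
    agent: str    # who acts (a character role)
    action: str   # what they do
    target: str   # whom/what the action affects

def retell_by_analogy(source, role_map):
    """Generate a new story by mapping roles from an extracted source
    story onto a new cast -- a toy stand-in for analogy-based generation."""
    remapped = [
        NarrativeTriple(role_map.get(t.agent, t.agent),
                        t.action,
                        role_map.get(t.target, t.target))
        for t in source
    ]
    return " ".join(f"The {t.agent} {t.action} the {t.target}." for t in remapped)

source = [NarrativeTriple("hero", "rescues", "princess"),
          NarrativeTriple("villain", "pursues", "hero")]
print(retell_by_analogy(source, {"hero": "knight", "princess": "queen"}))
# → The knight rescues the queen. The villain pursues the knight.
```

Errors at the extraction stage (a wrong agent or target in a triple) propagate directly into the generated text, which is exactly the pipeline sensitivity the paper's experiments examine.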
Intel acquires AI optimisation platform SigOpt for undisclosed sum
It's hoped that the combination of SigOpt's AI scaling software and Intel's hardware will give Intel competitive advantages in emerging tech. San Francisco's SigOpt is being acquired by Intel for an undisclosed sum. SigOpt's platform enables the optimisation of artificial intelligence (AI) software models at scale. Its present customer base includes Fortune 500 companies across different industries, as well as leading research institutions, universities and consortiums. With this acquisition, Intel plans to use SigOpt's software across its own AI hardware products to accelerate and grow its AI offerings to developers.
Big data and AI: 3 real-world use cases
The relationship between AI and big data is a two-way street, to be sure: Artificial intelligence success depends largely on high-quality data, and lots of it. Managing massive amounts of data and deriving value from it, meanwhile, increasingly depends upon technologies such as machine learning (ML) or natural language processing (NLP) to solve problems that would be too burdensome for humans to contend with on their own. It's a "virtuous cycle," as Anexinet senior digital strategist Glenn Gruber told us recently. Whereas the "big" in big data once might have been seen more as a challenge than an opportunity, this is changing as organizations begin rolling out enterprise uses of machine learning and other AI disciplines. "Today, we want as much [data] as we can get – not only to drive better insight into business problems we're trying to solve, but because the more data we put through the machine learning models, the better they get," Gruber explained.
Diffbot Sees The Web Like People Do, Now Free For Developers
Diffbot is a geeky and incredibly interesting technology that uses bots, algorithms, computer vision and artificial intelligence to process content on the Web the way a human being can. "The entire Internet can be broken down into 30 different page types," explains co-founder Mike Tung, also known as "Diffbot Mike," and "Diffbot can identify them all." Diffbot knows the difference between a social network profile, a blog post, a site's front page, a product page, an event page and dozens more. Today, Diffbot is releasing its first set of APIs, now open to all developers for free. The launch has the potential to dramatically impact the types of applications developers can build, and for consumers, it means a whole host of intelligent applications are about to emerge.